Matrix-matrix multiplication on graphics processing unit platform using tiling technique

Abstract

Today’s hardware platforms have parallel processing capabilities, and many parallel programming models have been developed. It is necessary to research efficient implementations of compute-intensive applications on the available platforms. Dense matrix-matrix multiplication is an important kernel that is used in many applications, yet it is computationally intensive, especially for large matrix sizes. To improve the performance of this kernel, we implement it on the graphics processing unit (GPU) platform using the tiling technique with different tile sizes. Our experimental results show that the tiling approach improves speed by 56.89% (2.32× faster) against the straightforward (STF) implementation. Tile size 32 has the highest speed compared to the other tile sizes, 8 and 16.
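The loop structure behind the tiling technique can be sketched on the CPU as follows. This is a minimal illustration in plain Python, not the authors' GPU implementation: the function names `matmul_straightforward` and `matmul_tiled` and the `tile` parameter are hypothetical, and the local `a_tile` and `b_tile` lists only stand in for the fast on-chip memory into which a GPU thread block would typically stage its tiles.

```python
def matmul_straightforward(a, b):
    """Reference triple loop: each output element re-reads a full row and column."""
    n, k, m = len(a), len(b), len(b[0])
    c = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            acc = 0.0
            for p in range(k):
                acc += a[i][p] * b[p][j]
            c[i][j] = acc
    return c


def matmul_tiled(a, b, tile=32):
    """Tiled (blocked) product: work proceeds one tile-by-tile sub-block at a time.

    `tile` is an assumed parameter name; the abstract reports tile sizes 8, 16 and 32.
    """
    n, k, m = len(a), len(b), len(b[0])
    c = [[0.0] * m for _ in range(n)]
    for i0 in range(0, n, tile):          # tile row of C
        for j0 in range(0, m, tile):      # tile column of C
            for p0 in range(0, k, tile):  # walk along the shared dimension
                # Stage one tile of A and one tile of B (stand-in for on-chip memory).
                # Slicing clips automatically at the matrix edges.
                a_tile = [row[p0:p0 + tile] for row in a[i0:i0 + tile]]
                b_tile = [row[j0:j0 + tile] for row in b[p0:p0 + tile]]
                # Accumulate the partial product of the two staged tiles into C.
                for i, a_row in enumerate(a_tile):
                    c_row = c[i0 + i]
                    for p, a_val in enumerate(a_row):
                        b_row = b_tile[p]
                        for j, b_val in enumerate(b_row):
                            c_row[j0 + j] += a_val * b_val
    return c
```

Each staged tile is reused for a whole block of output elements, which is the data-reuse idea that tiling relies on. Calling `matmul_tiled(a, b, tile=8)`, `tile=16`, or `tile=32` changes only the block size, not the result.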

Related resources

A new sparse matrix vector multiplication graphics processing unit algorithm designed for finite element problems

Recently, graphics processing units (GPUs) have been increasingly leveraged in a variety of scientific computing applications. However, architectural differences between CPUs and GPUs necessitate the development of algorithms that take advantage of GPU hardware. As sparse matrix vector (SPMV) multiplication operations are commonly used in finite element analysis, a new SPMV algorithm and severa...

Parallel Implementation of Particle Swarm Optimization Variants Using Graphics Processing Unit Platform

There are different variants of Particle Swarm Optimization (PSO) algorithm such as Adaptive Particle Swarm Optimization (APSO) and Particle Swarm Optimization with an Aging Leader and Challengers (ALC-PSO). These algorithms improve the performance of PSO in terms of finding the best solution and accelerating the convergence speed. However, these algorithms are computationally intensive. The go...

Automatic Tuning Matrix Multiplication Performance on Graphics Hardware

Graphics hardware’s performance is advancing much faster than the performance of conventional microprocessors. In order to utilize the tremendous computing power of these systems, it is critical to tune software to graphics hardware’s architectural features. The frequent changes in GPUs’ architecture and performance characteristics make it very desirable for such tuning to be automated. This pa...

Fast Reconstruction Technique for Medical Images Using Graphics Processing Unit

In many medical imaging modalities, the Fast Fourier Transform (FFT) is used to reconstruct images from acquired raw data. The objective of the paper is to develop FFT and inverse FFT algorithms that run on the GPU and perform much faster. The GPU-based FFT implementation provides much faster reconstruction of medical images than the CPU-based implementation. The GPU based al...

Matrix Multiplication Parallelization on a Many-Core Platform

This paper introduces an approach to analyze the power and energy consumption of a many-core system. The investigation has been done by using the Intel SCC system as an experimental platform. The approach is to collect the time and power profiling of an executing application on the Intel SCC system. And then, we find the total energy consumed for the entire execution. We studied the effects of ...

Journal

Journal title: Indonesian Journal of Electrical Engineering and Computer Science

Year: 2022

ISSN: 2502-4752, 2502-4760

DOI: https://doi.org/10.11591/ijeecs.v28.i2.pp1012-1019